A Unified View of Optimal Kernel Hypothesis Testing

Schrab, Antonin

arXiv.org Machine Learning

This paper provides a unifying view of optimal kernel hypothesis testing across the MMD two-sample, HSIC independence, and KSD goodness-of-fit frameworks. Minimax optimal separation rates in the kernel and $L^2$ metrics are presented, together with two adaptive kernel selection methods (kernel pooling and aggregation), under various testing constraints: computational efficiency, differential privacy, and robustness to data corruption. The intuition behind the derivation of the power results is given in a unified way across the three frameworks, and open problems are highlighted.
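
As a concrete instance of one of these frameworks, the sketch below computes the classical biased (V-statistic) HSIC estimate for independence testing with Gaussian kernels; the function names, kernel choice, and bandwidth parameters are illustrative assumptions, not anything prescribed by the paper.

```python
import numpy as np

def hsic_biased(X, Y, bandwidth_x, bandwidth_y):
    """Biased (V-statistic) HSIC estimate with Gaussian kernels:
    HSIC = (1/n^2) * trace(K H L H), with centring matrix H = I - 11^T / n."""
    n = X.shape[0]

    def gram(Z, bw):
        # Gaussian Gram matrix: exp(-||z_i - z_j||^2 / (2 * bw^2)).
        sq = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2 * bw ** 2))

    K, L = gram(X, bandwidth_x), gram(Y, bandwidth_y)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n ** 2
```

Under independence of X and Y the statistic concentrates near zero; the tests surveyed above calibrate a rejection threshold for such statistics via permutations or a wild bootstrap.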


Efficient Aggregated Kernel Tests using Incomplete $U$-statistics

Schrab, Antonin, Kim, Ilmun, Guedj, Benjamin, Gretton, Arthur

arXiv.org Artificial Intelligence

We propose a series of computationally efficient nonparametric tests for the two-sample, independence, and goodness-of-fit problems, using the Maximum Mean Discrepancy (MMD), Hilbert–Schmidt Independence Criterion (HSIC), and Kernel Stein Discrepancy (KSD), respectively. Our test statistics are incomplete $U$-statistics, with a computational cost that interpolates between linear time in the number of samples and the quadratic time associated with classical $U$-statistic tests. The three proposed tests aggregate over several kernel bandwidths to detect departures from the null on various scales: we call the resulting tests MMDAggInc, HSICAggInc and KSDAggInc. This procedure provides a solution to the fundamental kernel selection problem, as we can aggregate a large number of kernels with several bandwidths without incurring a significant loss of test power. For the test thresholds, we derive a quantile bound for wild bootstrapped incomplete $U$-statistics, which is of independent interest. We derive non-asymptotic uniform separation rates for MMDAggInc and HSICAggInc, and quantify exactly the trade-off between computational efficiency and the attainable rates: this result is, to our knowledge, novel for tests based on incomplete $U$-statistics. We further show that, in the quadratic-time case, the wild bootstrap incurs no penalty to test power over the more widespread permutation-based approach, since both attain the same minimax optimal rates (which in turn match the rates obtained with oracle quantiles). We support our claims with numerical experiments on the trade-off between computational efficiency and test power. In all three testing frameworks, the linear-time versions of our proposed tests perform at least as well as the current linear-time state-of-the-art tests.
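
A minimal sketch of the central construction, under illustrative assumptions (paired samples of equal size, a Gaussian kernel, a user-supplied design of index pairs); this is not the authors' released implementation:

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    # Gaussian kernel evaluated row-wise: exp(-||a - b||^2 / (2 * bandwidth^2)).
    return np.exp(-np.sum((a - b) ** 2, axis=-1) / (2 * bandwidth ** 2))

def incomplete_mmd_test(X, Y, bandwidth, pairs, n_bootstrap=500, alpha=0.05, seed=0):
    """Incomplete U-statistic MMD test with a wild-bootstrap threshold.

    X, Y  : (n, d) samples from the two distributions (equal sizes assumed).
    pairs : (L, 2) integer array of index pairs (i, j) with i != j; the design
            size L controls the trade-off between linear and quadratic cost.
    """
    rng = np.random.default_rng(seed)
    i, j = pairs[:, 0], pairs[:, 1]
    # Core term of the MMD U-statistic, evaluated only on the selected pairs.
    h = (gaussian_kernel(X[i], X[j], bandwidth)
         + gaussian_kernel(Y[i], Y[j], bandwidth)
         - gaussian_kernel(X[i], Y[j], bandwidth)
         - gaussian_kernel(X[j], Y[i], bandwidth))
    statistic = h.mean()
    # Wild bootstrap: resample the statistic with Rademacher sign flips.
    eps = rng.choice([-1.0, 1.0], size=(n_bootstrap, X.shape[0]))
    null_stats = (eps[:, i] * eps[:, j] * h).mean(axis=1)
    return statistic > np.quantile(null_stats, 1 - alpha)
```

Taking L of order n pairs gives a linear-time test, while using all n(n - 1) ordered pairs recovers the complete quadratic-time $U$-statistic; the aggregated tests run this construction for several bandwidths with a corrected level.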


KSD Aggregated Goodness-of-fit Test

Schrab, Antonin, Guedj, Benjamin, Gretton, Arthur

arXiv.org Machine Learning

We investigate properties of goodness-of-fit tests based on the Kernel Stein Discrepancy (KSD). We introduce a strategy to construct a test, called KSDAgg, which aggregates multiple tests with different kernels. KSDAgg avoids splitting the data to perform kernel selection (which leads to a loss in test power), instead maximising the test power over a collection of kernels. We provide theoretical guarantees on the power of KSDAgg: we show it achieves the smallest uniform separation rate of the collection, up to a logarithmic term. KSDAgg can be computed exactly in practice, as it relies either on a parametric bootstrap or on a wild bootstrap to estimate the quantiles and the level corrections. In particular, for the crucial choice of the bandwidth of a fixed kernel, it avoids resorting to arbitrary heuristics (such as the median or standard deviation) or to data splitting. We find on both synthetic and real-world data that KSDAgg outperforms other state-of-the-art adaptive KSD-based goodness-of-fit testing procedures.
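
The complete quadratic-time KSD $U$-statistic that such a test aggregates over bandwidths can be sketched as follows for a Gaussian kernel, using the kernel's closed-form derivatives; the function name and signature are illustrative assumptions:

```python
import numpy as np

def ksd_u_statistic(X, score, bandwidth):
    """U-statistic estimate of the squared Kernel Stein Discrepancy.

    X     : (n, d) sample from the unknown distribution q.
    score : callable returning grad log p(x) row-wise, shape (n, d); only the
            score of the model p is needed, not its normalising constant.
    Uses the Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
    """
    n, d = X.shape
    S = score(X)                                   # model scores at each point
    diff = X[:, None, :] - X[None, :, :]           # (n, n, d) pairwise differences
    sq = np.sum(diff ** 2, axis=-1)                # squared pairwise distances
    K = np.exp(-sq / (2 * bandwidth ** 2))
    dot = np.einsum('id,ijd->ij', S, diff)         # s(x_i) . (x_i - x_j)
    dot_t = np.einsum('jd,ijd->ij', S, diff)       # s(x_j) . (x_i - x_j)
    # Stein kernel u_p(x, y): the four terms come from s(x)^T s(y) k,
    # s(x)^T grad_y k, s(y)^T grad_x k, and trace(grad_x grad_y k).
    H = (S @ S.T
         + (dot - dot_t) / bandwidth ** 2
         + d / bandwidth ** 2 - sq / bandwidth ** 4) * K
    np.fill_diagonal(H, 0.0)                       # drop diagonal: U-statistic
    return H.sum() / (n * (n - 1))
```

For a standard normal model the score is simply `score = lambda X: -X`; an aggregated test then computes this statistic for several bandwidths and calibrates the thresholds by bootstrap.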


MMD Aggregated Two-Sample Test

Schrab, Antonin, Kim, Ilmun, Albert, Mélisande, Laurent, Béatrice, Guedj, Benjamin, Gretton, Arthur

arXiv.org Machine Learning

We propose a novel nonparametric two-sample test based on the Maximum Mean Discrepancy (MMD), constructed by aggregating tests with different kernel bandwidths. This aggregation procedure, called MMDAgg, ensures that test power is maximised over the collection of kernels used, without requiring held-out data for kernel selection (which results in a loss of test power) or arbitrary kernel choices such as the median heuristic. We work in the non-asymptotic framework and prove that our aggregated test is minimax adaptive over Sobolev balls. Our guarantees are not restricted to a specific kernel, but hold for any product of one-dimensional translation-invariant characteristic kernels which are absolutely and square integrable. Moreover, our results apply to popular numerical procedures for determining the test threshold, namely permutations and the wild bootstrap. Through numerical experiments on both synthetic and real-world datasets, we demonstrate that MMDAgg outperforms alternative state-of-the-art approaches to MMD kernel adaptation for two-sample testing.
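
The aggregation step can be sketched as follows, assuming the per-bandwidth statistics and their permutation or wild-bootstrap null samples have already been computed; as a simplification, this sketch calibrates the level correction on the same bootstrap samples, whereas the paper uses an extra independent set:

```python
import numpy as np

def aggregated_rejection(stats, boot_stats, alpha=0.05):
    """Sketch of the level-correction step shared by MMDAgg-style tests.

    stats      : (K,) observed statistics, one per kernel bandwidth.
    boot_stats : (K, B) permutation or wild-bootstrap statistics per bandwidth.
    Rejects if any statistic exceeds its corrected bootstrap quantile, with a
    common correction u chosen by bisection so that the estimated probability
    of any false rejection stays below alpha.
    """
    K, B = boot_stats.shape
    weights = np.full(K, 1.0 / K)            # uniform weights over bandwidths
    sorted_boot = np.sort(boot_stats, axis=1)

    def quantiles(u):
        # Empirical (1 - u * w_k * alpha)-quantile for each bandwidth.
        levels = 1 - u * weights * alpha
        idx = np.clip(np.ceil(B * levels).astype(int) - 1, 0, B - 1)
        return sorted_boot[np.arange(K), idx]

    u_lo, u_hi = 0.0, 1.0 / alpha            # keeps u * w_k * alpha <= 1/K
    for _ in range(30):                      # bisection on the correction u
        u = 0.5 * (u_lo + u_hi)
        q = quantiles(u)
        level = np.mean(np.any(boot_stats > q[:, None], axis=0))
        if level <= alpha:
            u_lo = u                         # feasible: push u higher
        else:
            u_hi = u
    return bool(np.any(stats > quantiles(u_lo)))
```

The bisection finds the largest correction that keeps the simultaneous (multiple-testing) level at alpha, which is what lets the aggregated test inherit the power of the best single-bandwidth test up to a logarithmic factor.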


Minimax optimal goodness-of-fit testing for densities under a local differential privacy constraint

Lam-Weil, Joseph, Laurent, Béatrice, Loubes, Jean-Michel

arXiv.org Machine Learning

Finding anonymization mechanisms to protect personal data is at the heart of machine learning research. Here we consider the consequences of local differential privacy constraints on goodness-of-fit testing, i.e. the statistical problem of assessing whether sample points are generated from a fixed density $f_0$ or not. The observations are hidden, each being replaced by the output of a stochastic transformation satisfying the local differential privacy constraint. In this setting, we propose a new testing procedure based on an estimate of the quadratic distance between the density $f$ of the unobserved sample and $f_0$. We establish minimax separation rates for our test over Besov balls, and provide a matching lower bound proving the optimality of our result. To the best of our knowledge, we provide the first minimax optimal test and associated private transformation under a local differential privacy constraint, quantifying the price to pay for data privacy.
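
The paper designs an optimal private transformation; as a simpler illustration of a non-interactive locally private channel of the same flavour, the sketch below releases a Laplace-perturbed histogram encoding of each observation (a generic textbook mechanism, not the paper's transformation):

```python
import numpy as np

def privatize_histogram(x, bins, eps, rng=None):
    """Non-interactive local DP release of one observation via the Laplace
    mechanism on a histogram encoding (illustrative, not the paper's channel).

    x    : scalar observation in [0, 1].
    bins : number of histogram cells.
    eps  : local differential privacy parameter.
    Returns a noisy vector z with E[z] equal to the true cell indicator.
    """
    rng = np.random.default_rng(rng)
    indicator = np.zeros(bins)
    indicator[min(int(x * bins), bins - 1)] = 1.0
    # Changing x moves one unit of mass between two cells, so the encoding
    # has L1 sensitivity 2; Laplace(2 / eps) noise per coordinate therefore
    # yields an eps-locally differentially private release.
    return indicator + rng.laplace(scale=2.0 / eps, size=bins)
```

Averaging the released vectors across users gives unbiased estimates of the cell probabilities of $f$, from which a quadratic-distance statistic to $f_0$ can be formed, at the cost of the extra noise variance that the minimax rates in the paper quantify.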